Introduction


The research conducted during this six-month period can be divided into three areas. In all
three areas significant results were obtained. Specifically,

1. (a) The theory behind the proposed joint source/channel coding approach was developed.
This has provided insights into the design of robust source coders.

(b) A variable rate design approach which provides substantial improvement over current
joint source/channel coder designs was obtained.

2. (a) The Rice algorithm was evaluated and its advantages and shortcomings were examined 
in detail. 

(b) An alternative algorithm was obtained which outperforms the Rice algorithm both in 
terms of data compression and noisy channel performance. 

3. A high fidelity low rate image compression algorithm was developed which provides almost 
distortionless compression of high resolution images. 




Section 1 


Design of Joint Source/Channel Coders 


1.1 Motivation 

A block diagram of a typical communication system is shown below. 



Figure 1.1: A typical communication system 


The source coder removes redundancy from the input, thus reducing the amount of information
to be transmitted. Redundancy is reintroduced in a “controlled” manner by the channel coder.
By “controlled” manner we mean that the redundancy introduced is of a form which can be used by
the channel decoder. The source coder is generally designed without taking the channel statistics
into consideration. Similarly, the channel coder is designed without consideration of the source
statistics. This separation of source and channel coder design is justified by a result of Shannon's.
Shannon [1] has shown that when the rate of transmission R is less than the channel capacity C,
there exist coding schemes which allow us to drive the probability of error arbitrarily close to
zero. In this situation the link between the source encoder and decoder is essentially error free. As
such, the source coder/decoder pair can be designed without any regard for the effect of noise on
the decoding process. Also, if the source coder output contains no redundancy, it can be viewed
as samples of an iid process. Thus the channel coder/decoder can be designed without taking into
account the source coder output statistics.

These separation arguments break down if, for whatever reason, either of the following happens. 

1. The input to the source decoder is different from the output of the source encoder, i.e., the
link between the source encoder/decoder is no longer error free. 

2. The output of the source coder contains redundancy. 

If (1) occurs, the effect of channel errors on the decoding process needs to be considered. In
this situation the source encoder/decoder pair should be designed to minimize the effect
of channel errors. This situation has been studied by various researchers [2]-[7]. Situation
(2) arises in a variety of circumstances. Incorrect assumptions about the statistical parameters
describing the source result in correlation in the source output [8]. Non-stationarity of the source
may also cause the appearance of redundancy. This redundancy can be used to correct errors in
the channel [9]-[12]. The source coders which mitigate the effect of channel errors are collectively
known as joint source/channel coders.

In this paper we present an approach to joint source/channel coder design. To facilitate the 
presentation some nomenclature is in order. First we redraw our block diagram. 



Figure 1.2: Proposed system 


Note that the channel coder has been removed while the channel decoder has been replaced by
a block marked receiver. The purpose of the receiver is to use the redundancy at the output of the source
coder to provide error protection.

Let the source coder alphabet be denoted by $A$ where

\[ A = \{a_1, a_2, \ldots, a_N\} \]

The source coder output is denoted by $y_i$, the channel output by $\hat{y}_i$, and the receiver output
by $\tilde{y}_i$. We go about designing the receiver in the following manner. Recall that in the classical
formulation, the optimal receiver (in terms of maximizing the probability of correct decision) is
one which selects $a$ to maximize

\[ P[y_i = a \mid \hat{y}_i] \]

We extend this formulation to decoding sequences of received symbols rather than one symbol 
at a time. To this end we define 

\[ y = (y_0, y_1, y_2, \ldots, y_L) \]
\[ \hat{y} = (\hat{y}_0, \hat{y}_1, \hat{y}_2, \ldots, \hat{y}_L) \]

Furthermore, we assume that the first coder output symbol yo is known. This can be justified 
by assuming that the first output symbol represents the coder output when the source is quiescent. 
Thus the optimum receiver will maximize 

\[ P[y \mid \hat{y}] = P[y_0, y_1, \ldots, y_L \mid \hat{y}_0, \ldots, \hat{y}_L] \tag{1.1} \]

This can be rewritten as 

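\[
\begin{aligned}
P[y \mid \hat{y}] ={}& P[y_L \mid y_{L-1}, \ldots, y_0, \hat{y}_0, \ldots, \hat{y}_L] \cdot P[y_{L-1} \mid y_{L-2}, \ldots, y_0, \hat{y}_0, \ldots, \hat{y}_L] \\
& \cdot P[y_{L-2} \mid y_{L-3}, \ldots, y_0, \hat{y}_0, \ldots, \hat{y}_L] \cdots P[y_1 \mid y_0, \hat{y}_0, \ldots, \hat{y}_L] \\
& \cdot P[y_0 \mid \hat{y}_0, \ldots, \hat{y}_L]
\end{aligned}
\tag{1.2}
\]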

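Assume that given $y_{n-1}$, $y_n$ is conditionally independent of $y_{n-k}$, $k > 1$. Then, noting that the
last factor in (1.2) is unity, and assuming a memoryless channel

\[ P[y \mid \hat{y}] = P[y_L \mid y_{L-1}, \hat{y}_L] \, P[y_{L-1} \mid y_{L-2}, \hat{y}_{L-1}] \cdots P[y_1 \mid y_0, \hat{y}_1] = \prod_{i=1}^{L} P[y_i \mid y_{i-1}, \hat{y}_i] \]

Maximizing $P[y \mid \hat{y}]$ is the same as maximizing its log. Thus the optimum receiver maximizes

\[ \log P[y \mid \hat{y}] = \sum_{i=1}^{L} \log P[y_i \mid y_{i-1}, \hat{y}_i] \]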

If we call $\log P[y \mid \hat{y}]$ the path metric then $\log P[y_i \mid y_{i-1}, \hat{y}_i]$ is the branch metric. The design
of a receiver which maximizes a path metric which is the sum of branch metrics is a well known 
problem in several different fields. In the field of communications this is simply the problem of 
design of decoders for convolutional codes. 

While the structure of the receiver is evident, we still have to obtain the values of $P[y_i \mid y_{i-1}, \hat{y}_i]$.
Sayood and Borkenhagen [9] have obtained an expression for $P[y_i \mid y_{i-1}, \hat{y}_i]$ in terms of the source
coder output transition probabilities $P[y_i \mid y_{i-1}]$ and the channel transition probabilities $P[\hat{y}_i \mid y_i]$ as

\[ P[y_i = a_j \mid y_{i-1} = a_m, \hat{y}_i = a_n] = \frac{P[\hat{y}_i = a_n \mid y_i = a_j] \, P[y_i = a_j \mid y_{i-1} = a_m]}{\sum_l P[y_i = a_l \mid y_{i-1} = a_m] \, P[\hat{y}_i = a_n \mid y_i = a_l]} \]

For implementation, the channel transition probabilities can be obtained by modeling the chan- 
nel as a DMC. The source coder output transition probabilities can be obtained from training 
sequences. 
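
The receiver described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in [9]: the function names, the list-of-lists layout of the transition tables (`trans[prev][cur]` for the source coder output, `chan[sent][recv]` for the channel) and the binary symmetric channel helper are all assumptions made for the example. The search itself is the standard Viterbi recursion over a path metric that is a sum of branch metrics.

```python
import math

def bsc_channel(bits, p):
    """Hypothetical helper: chan[sent][recv] for `bits`-bit words on a BSC."""
    n = 1 << bits
    table = []
    for sent in range(n):
        row = []
        for recv in range(n):
            d = bin(sent ^ recv).count("1")          # Hamming distance
            row.append((p ** d) * ((1 - p) ** (bits - d)))
        table.append(row)
    return table

def branch_metric(trans, chan, prev, cur, recv):
    """log P[y_i = cur | y_{i-1} = prev, yhat_i = recv]."""
    num = chan[cur][recv] * trans[prev][cur]
    if num <= 0:
        return float("-inf")
    den = sum(trans[prev][l] * chan[l][recv] for l in range(len(trans)))
    return math.log(num / den)

def decode(received, trans, chan, start=0):
    """Viterbi search for the symbol sequence with the largest path metric."""
    n = len(trans)
    score = [float("-inf")] * n
    score[start] = 0.0                               # y_0 is assumed known
    history = []
    for recv in received:
        new_score, pointers = [], []
        for cur in range(n):
            totals = [score[p] + branch_metric(trans, chan, p, cur, recv)
                      for p in range(n)]
            best_prev = max(range(n), key=lambda p: totals[p])
            pointers.append(best_prev)
            new_score.append(totals[best_prev])
        score = new_score
        history.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(history[1:]):           # trace survivors backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With a strongly correlated source coder output the decoder can override an isolated channel error, while for an iid output it simply returns the received symbols, which matches the discussion of redundancy above.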

This particular approach toward using the redundancy in the source coder output is especially 
attractive as it leaves the door open to other joint source channel coding approaches. 

This approach was used by Sayood and Borkenhagen [9] in the differential encoding of images. 
The results were highly satisfying. 


1.2 Design 

The main error correcting power of this method arises due to the variations in the source coder
output conditional probabilities. To see this more clearly let us examine the conditions under
which an error is made. Referring to Figure 1.3, assume that the correct sequence of transmitted
codewords was $a_0 a_0 a_0$. An error occurs if the path $a_0 a_j a_0$ is selected over the path $a_0 a_0 a_0$. This
will happen if the path metric for $a_0 a_j a_0$ is greater than the path metric of $a_0 a_0 a_0$. Assume
$\hat{y}_1 = a_n$, $\hat{y}_2 = a_m$.



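Then an error occurs if the quantity

\[
\begin{aligned}
&\log \frac{P[\hat{y}_1 = a_n \mid y_1 = a_j] \, P[y_1 = a_j \mid y_0 = a_0]}{\sum_l P[y_1 = a_l \mid y_0 = a_0] \, P[\hat{y}_1 = a_n \mid y_1 = a_l]}
+ \log \frac{P[\hat{y}_2 = a_m \mid y_2 = a_0] \, P[y_2 = a_0 \mid y_1 = a_j]}{\sum_l P[y_2 = a_l \mid y_1 = a_j] \, P[\hat{y}_2 = a_m \mid y_2 = a_l]} \\
&- \log \frac{P[\hat{y}_1 = a_n \mid y_1 = a_0] \, P[y_1 = a_0 \mid y_0 = a_0]}{\sum_l P[y_1 = a_l \mid y_0 = a_0] \, P[\hat{y}_1 = a_n \mid y_1 = a_l]}
- \log \frac{P[\hat{y}_2 = a_m \mid y_2 = a_0] \, P[y_2 = a_0 \mid y_1 = a_0]}{\sum_l P[y_2 = a_l \mid y_1 = a_0] \, P[\hat{y}_2 = a_m \mid y_2 = a_l]}
\end{aligned}
\tag{1.3}
\]

is greater than zero. Cancelling the common terms and rearranging we get

\[
\log \frac{P[\hat{y}_1 = a_n \mid y_1 = a_j]}{P[\hat{y}_1 = a_n \mid y_1 = a_0]}
+ \log \frac{P[y_1 = a_j \mid y_0 = a_0]}{P[y_1 = a_0 \mid y_0 = a_0]}
+ \log \frac{P[y_2 = a_0 \mid y_1 = a_j]}{P[y_2 = a_0 \mid y_1 = a_0]}
- \log \frac{\sum_l P[y_2 = a_l \mid y_1 = a_j] \, P[\hat{y}_2 = a_m \mid y_2 = a_l]}{\sum_l P[y_2 = a_l \mid y_1 = a_0] \, P[\hat{y}_2 = a_m \mid y_2 = a_l]} > 0
\]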

Assuming the channel to be a binary symmetric channel and $W = \log_2 M$, the length of the
codewords,

\[ P[\hat{y}_i = a_n \mid y_i = a_j] = p^{d_{nj}} (1 - p)^{W - d_{nj}} = (1 - p)^W \left( \frac{p}{1 - p} \right)^{d_{nj}} \]

where $d_{ij}$ is the Hamming distance between $a_i$ and $a_j$ and $p$ is the channel crossover (bit error)
probability. Then

\[
\log \frac{P[\hat{y}_1 = a_n \mid y_1 = a_j]}{P[\hat{y}_1 = a_n \mid y_1 = a_0]}
= \log \frac{p^{d_{nj}} (1 - p)^{W - d_{nj}}}{p^{d_{n0}} (1 - p)^{W - d_{n0}}}
= (d_{nj} - d_{n0}) \log \frac{p}{1 - p}
\tag{1.4}
\]


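Define

\[ g_{lk} = P[y_i = a_l \mid y_{i-1} = a_k] \]

Then an error occurs if

\[
(d_{nj} - d_{n0}) \log \frac{p}{1 - p} + \log \frac{g_{j0}}{g_{00}} + \log \frac{g_{0j}}{g_{00}}
- \log \frac{\sum_l \left( \frac{p}{1 - p} \right)^{d_{ml}} g_{lj}}{\sum_l \left( \frac{p}{1 - p} \right)^{d_{ml}} g_{l0}} > 0
\]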


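Let $\alpha = \frac{1 - p}{p}$; then the above can be rewritten as

\[
(d_{n0} - d_{nj}) \log \alpha + \log \frac{g_{j0}}{g_{00}} + \log \frac{g_{0j}}{g_{00}}
- \log \frac{\sum_l \alpha^{-d_{ml}} g_{lj}}{\sum_l \alpha^{-d_{ml}} g_{l0}} > 0
\]

or, for $p < 1/2$ so that $\log \alpha > 0$,

\[
d_{n0} - d_{nj} > \frac{1}{\log \alpha} \left( \log \frac{g_{00}}{g_{j0}} + \log \frac{g_{00}}{g_{0j}}
+ \log \frac{\sum_l \alpha^{-d_{ml}} g_{lj}}{\sum_l \alpha^{-d_{ml}} g_{l0}} \right)
\]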

The left hand side is maximized when $j = n$ ($d_{nj} = 0$), in which case an error occurs if

\[
d_{n0} > \frac{1}{\log \alpha} \left( \log \frac{g_{00}}{g_{j0}} + \log \frac{g_{00}}{g_{0j}}
+ \log \frac{\sum_l \alpha^{-d_{ml}} g_{lj}}{\sum_l \alpha^{-d_{ml}} g_{l0}} \right)
\]

If we pick $n$ such that $d_{n0} = \min_k \{d_{k0}\}$ then the quantity on the right is the number of bit
errors that can be corrected by this system in a span of $2W$ bits. Examining the RHS we see that
this quantity is zero when the conditional probabilities ($g_{lk}$) are all the same, i.e., the channel input
is iid. This validates both the idea that when the source coder removes all redundancy it should
not be considered for channel error correction, and the thesis proposed here that redundancy in
the source coder output can be used to correct channel errors. The type of redundancy necessary
is also evident from the inequality. We wish to increase the variability of $g_{lk}$. To state this more
formally, our objective is to minimize $H(y_n \mid y_{n-1})/M$ where $M$ is the size of the alphabet. The
next step, of course, is to examine ways of designing (joint) source (channel) coders which contain this
type of redundancy. Before we look into that there is one more interesting observation that can
be made. $\alpha$ is a decreasing function of $p$, thus $1/\log \alpha$ is an increasing function of $p$. This means
that for a given source (/channel) coder, an increase in the probability of error in the channel will
increase the number of bit errors that can be corrected. Because the number of bit errors also
increases, this results in a flattening out of the performance curve. This behaviour was observed
in [9].
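
To make the role of the transition probabilities concrete, the right hand side of the error inequality can be evaluated numerically. The following is a sketch under stated assumptions: codewords are taken to be the binary representations of their indices, `g[l][k]` holds $P[y_i = a_l \mid y_{i-1} = a_k]$, and the function name `correctable_bits` is hypothetical.

```python
import math

def hamming(a, b):
    """Hamming distance between the binary labels of two codeword indices."""
    return bin(a ^ b).count("1")

def correctable_bits(g, p, j, m):
    """Right hand side of the error inequality for competing symbol a_j,
    received second symbol a_m, and BSC crossover probability p < 1/2."""
    alpha = (1 - p) / p
    size = len(g)
    num = sum(alpha ** -hamming(m, l) * g[l][j] for l in range(size))
    den = sum(alpha ** -hamming(m, l) * g[l][0] for l in range(size))
    total = (math.log(g[0][0] / g[j][0])
             + math.log(g[0][0] / g[0][j])
             + math.log(num / den))
    return total / math.log(alpha)
```

For a uniform table (an iid coder output) the function returns zero, and for a table with a dominant diagonal it returns a positive value that grows with $p$, in line with the two observations made in the text.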

1.3 Example 


A more detailed version of the transmitter side is shown below.



Figure 1.4: A Source Coder


The data compression block consists of source coding algorithms which are not information
preserving such as DPCM, Transform Coding, etc. The data compaction block consists of information
preserving or noiseless coding algorithms such as Huffman coding or runlength coding.
The information preserving compression algorithms are especially vulnerable to channel errors, as
an error may cause timing and synchronization errors which may propagate for extremely long
periods. To be able to correct errors by the technique presented previously, we need to increase
the desired type of redundancy. To this end we modify our source coder as follows.



Figure 1.5: Proposed Joint Source/Channel Coder

The objective of the block II is to generate an output sequence $y_n$ such that $H(y_n \mid y_{n-1})/M$ is
minimum. A simple mapping which does that is as follows. Let the input to II be selected from
the alphabet

\[ A = \{a_0, a_1, \ldots, a_{N-1}\} \]

Then let the output alphabet be

\[ S = \{s_0, s_1, \ldots, s_{N^2-1}\} \]

Note that while the input alphabet is of size $N$, the output alphabet is of size $N^2$. The
input/output mapping is given as

\[ x_n = a_i, \; x_{n-1} = a_j \;\rightarrow\; y_n = s_{jN+i} \]

While we still have to show mathematically that this results in a decrease of $H(y_n \mid y_{n-1})/M$,
we can see the effect. If we look at all pairs $(y_n = s_i, y_{n-1} = s_j)$ we can see that certain pairs are
disallowed because of the mapping. For example, the $N^2 - N$ pairs of the form $(y_n = s_0, y_{n-1} = s_j)$
where $j \neq kN$ ($k = 0, 1, \ldots, N-1$) are disallowed by the mapping. Thus $P[y_n = s_0 \mid y_{n-1} = s_j] = 0$
for $j \neq kN$, $k = 0, 1, \ldots, N-1$. For pairs that are allowed $P[y_n \mid y_{n-1}] = P[x_n \mid x_{n-1}, x_{n-2}]$. All
this together with the fact that $M$ is now equal to $N^2$ instead of $N$ means that we have in some
sense achieved our objective. Another way of looking at this is that because certain sequences
are disallowed, errors which cause such sequences to be generated will be detected and perhaps
corrected.
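
The pair mapping and the resulting set of disallowed transitions can be illustrated with a short sketch. It assumes the indexing $y_n = s_{jN+i}$ for $x_n = a_i$, $x_{n-1} = a_j$, which is the indexing consistent with the disallowed pairs listed above; the function names and the choice of $a_0$ as the symbol preceding the first input are assumptions for the example.

```python
def pair_index(cur, prev, n):
    """Index of y_n in S for x_n = a_cur and x_{n-1} = a_prev."""
    return prev * n + cur

def encode_pairs(xs, n, first_prev=0):
    """Map a sequence of input indices to output indices, one pair at a time."""
    ys, prev = [], first_prev
    for x in xs:
        ys.append(pair_index(x, prev, n))
        prev = x
    return ys

def is_allowed(y_cur, y_prev, n):
    """A transition is possible only if the 'previous symbol' carried by y_n
    equals the 'current symbol' carried by y_{n-1}."""
    return y_cur // n == y_prev % n
```

For $N = 4$, exactly $N^2 - N = 12$ of the sixteen possible predecessors of $s_0$ are disallowed, which is the structure the receiver can exploit to detect channel errors.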

While the mapping II does seem to increase our error correcting capability, what does it cost
in terms of additional rate? This is easy to answer if we assume that the data compaction scheme
coding rate is equal to the entropy of its input. In the first case the rate is simply $H(x_n)$. In the
second case, as we are actually coding the pairs $(x_n, x_{n-1})$, the rate is $H(x_n, x_{n-1})$. For a
stationary source we can write $H(x_n, x_{n-1})$ as

\[ H(x_n, x_{n-1}) = H(x_n) + H(x_n \mid x_{n-1}) \]

Thus the additional cost for this error correcting capability is $H(x_n \mid x_{n-1})$. To minimize the
cost we have to minimize $H(x_n \mid x_{n-1})$. This is very nice because $H(y_n \mid y_{n-1})/M$ seems to be a
monotonic function of $H(x_n \mid x_{n-1})$. Thus we have identified an objective in the design of the data
compression scheme: Minimize $H(x_n \mid x_{n-1})$.
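
The rate cost of coding pairs can be checked numerically for a simple source model. The sketch below assumes a stationary first-order Markov source described by a stationary distribution and a row-stochastic transition table `trans[prev][cur]`; both the model and the function names are illustrative assumptions, not part of the system described here.

```python
import math

def entropy(probs):
    """Entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def pair_rate_cost(stationary, trans):
    """Return H(x_n), H(x_n | x_{n-1}) and H(x_n, x_{n-1}) in bits."""
    h_single = entropy(stationary)
    h_cond = sum(pi * entropy(row) for pi, row in zip(stationary, trans))
    joint = [pi * q for pi, row in zip(stationary, trans) for q in row]
    return h_single, h_cond, entropy(joint)
```

For a two-state source that repeats its previous symbol with probability 0.9, the conditional entropy is well below one bit, so the extra rate paid for the pair mapping is correspondingly small.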

This system was utilized with an image coding system. The data compression scheme was a 2 
bit DPCM system. The source was a 256 x 256 image. End of line resynchronization was assumed. 
Some preliminary results are shown in Figure 1.6. 

We can see from the figure that there is a substantial improvement in performance at low error 
rates. There is, however, some degradation at higher error rates. The reason for this is that the 
decoding scheme described above has not completely been implemented. As soon as this is done, 
we expect a flattening out of the performance curve. 

1.4 Continuing Effort 

In the next six months we plan to further refine our error correction scheme. This scheme will then 
be used in conjunction with the algorithms developed for coding of the gamma ray information 
(section 2). 

Figure 1.6: Performance comparison between standard source coding and joint source/channel coding.

Section 2


Efficient Coding and Transmission of Gamma Ray Information from the Mars 
Orbiter 


2.1 Problem Statement 

The output of a Gamma Ray detector is quantized using a 14 bit A/D. The number of times each of the
16,384 output levels occurs in a 30 second interval is obtained. The contents of the 16,384 “bins”
are transmitted using a transmission rate of 600 bits per second. This means that the contents of
the bins have to be noiselessly encoded using 18,000 bits (600 bits per second for 30 seconds), or about 1.1 bits per bin.

2.2 The Rice Algorithm 

The proposed coding algorithm is actually a collection of highly efficient noiseless coding techniques
developed by R. F. Rice at the Jet Propulsion Laboratory [1]. The various techniques are used
adaptively depending on the changing characteristics of the data. Rice has shown that by adaptively
selecting the technique best suited to the data, performance close to the entropy of the source can
be obtained for memoryless sources. He shows this to be true for a wide range of entropies. For the
range of entropies of interest the set of techniques called the Basic compressor is most appropriate.
In the following paragraphs we give a brief description of these techniques. Much more detailed
expositions can be found in [1]-[4].

Before any of the techniques due to Rice can be invoked, the data has to be preprocessed
to remove correlation. The preprocessed data has to be relabeled into the set of non-negative
integers. The correlation removal operation suggested by Rice [1] is a simple differencing step.

Before detailing the techniques that comprise the Basic compressor some definitions are in 
order. 

Fundamental Sequence: The code operator $fs[\cdot]$ is defined by

\[ fs[i] = \underbrace{0 0 \cdots 0}_{i \text{ zeros}} 1 \]

Let $x$ be a sequence of nonnegative integers

\[ x = x_1 x_2 x_3 \ldots x_J \]

Then the Fundamental Sequence corresponding to $x$, $FS[x]$, is given as

\[ FS[x] = fs[x_1] * fs[x_2] * fs[x_3] * \cdots * fs[x_J] \]

where $*$ denotes concatenation. As an example take the sequence $x = 1302$; then $FS[x] =
0100011001$.

Sequence Extension: Let $y$ be any $J$ sample sequence; then an extended sequence $y^*$ is formed
by terminating $y$ with enough zeros to make the length of the resulting sequence a multiple of $e$. The $e$th
extension of $y$, $y^e$, is simply the grouping of $y^*$ into $e$-tuples. Suppose $y = 1101101$ and $e = 3$; then
$y^* = 110110100$ and the third extension is $y^3 = (110) * (110) * (100)$.

Complementation: Given any binary sequence $x$, the sequence $\bar{x} = COMP[x]$ is simply the
bitwise complement of $x$.

Coding a Sequence: Given a binary sequence $x$, the coded version of the sequence $C[x]$ is simply
the Huffman coded $e$th extension of $x$. Thus if $e$ is 3 then $C[x]$ is the sequence obtained by coding
the 3-tuples of the 3rd extension using a Huffman code designed for an eight letter alphabet.
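
The first three definitions translate directly into code. The sketch below represents binary sequences as Python strings of '0' and '1'; the function names are chosen for the example, and the Huffman coding step $C[\cdot]$ is deliberately left out because its code table is not given here.

```python
def fs(i):
    """Code word for a nonnegative integer: i zeros followed by a one."""
    return "0" * i + "1"

def fundamental_sequence(xs):
    """Concatenate the code words of all samples in the block."""
    return "".join(fs(x) for x in xs)

def extend(bits, e):
    """Pad with zeros to a multiple of e and group the result into e-tuples."""
    padded = bits + "0" * (-len(bits) % e)
    return [padded[k:k + e] for k in range(0, len(padded), e)]

def complement(bits):
    """Bitwise complement of a binary string."""
    return "".join("1" if b == "0" else "0" for b in bits)
```

Applied to the examples in the text, `fundamental_sequence([1, 3, 0, 2])` gives a ten bit sequence (four terminating ones plus six zeros), and `extend("1101101", 3)` pads two zeros before grouping.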

With these definitions we can now proceed to define the four operators which make up the
Basic compressor. These are denoted by the symbols $\psi_0, \psi_1, \psi_2, \psi_3$. For a sequence of nonnegative
integers $x$, the four operators are defined as follows:

1. $\psi_0[x] = C[\overline{FS[x]}]$, i.e., $\psi_0[x]$ is the coded $e$th extension of the complement of the fundamental
sequence corresponding to $x$.

2. $\psi_1[x] = FS[x]$, i.e., $\psi_1[x]$ is the fundamental sequence corresponding to $x$.

3. $\psi_2[x] = C[FS[x]]$, i.e., $\psi_2[x]$ is the coded $e$th extension of the fundamental sequence corresponding
to $x$.

4. $\psi_3[x]$ is simply the binary representation of $x$.

Some additional overhead must be tacked on to the $\psi_0$ and $\psi_2$ operators. The additional bits
record the number of zeros added during the extension process. This information is necessary at
the decoder. While this is not mentioned in [1] we found it necessary in our simulations.

The Basic compressor functions as follows: The input is partitioned into blocks. Rice suggests
a block size of 16. We found this to be a good choice and have used it in our simulations. Each
block is then coded using the “best” operator. The coded sequence is transmitted along with a two
bit label (ID) denoting the operator used. The decision as to which operator is to be used can be
made in one of two ways. The first way is to actually code the block using the four operators and then
select the one which uses the fewest bits. The second way is a decision rule proposed by Rice.
The decision rule functions as follows. Let $x$ be a sequence of $J$ nonnegative integers. The length
of the fundamental sequence, $F$, is

\[ F = \sum_{i=1}^{J} (x_i + 1) = J + \sum_{i=1}^{J} x_i \]

Four functions $\gamma_0, \gamma_1, \gamma_2, \gamma_3$ corresponding to the four operators can be defined as

\[ \gamma_0 = \lceil F/3 \rceil + 2(F - J) \]
\[ \gamma_1 = F \]
\[ \gamma_2 = \lceil F/3 \rceil + 2J \]
\[ \gamma_3 = \text{constant} \]

The adaptive algorithm then selects the coding operator corresponding to the minimum $\gamma$.

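
The decision rule can be sketched as follows. The value of the constant $\gamma_3$ is not given in the text, so it is passed in as a parameter (`gamma3`, a hypothetical name); the other three estimates follow the expressions above, with $F$ computed as the length of the fundamental sequence.

```python
import math

def select_operator(block, gamma3):
    """Return (index of the chosen operator, list of the four gamma values).

    block  : list of nonnegative integers (one block of preprocessed samples)
    gamma3 : assumed fixed cost of sending the block in plain binary
    """
    J = len(block)
    F = sum(x + 1 for x in block)              # length of FS[block]
    gammas = [
        math.ceil(F / 3) + 2 * (F - J),        # psi_0: coded complement of FS
        F,                                     # psi_1: FS sent as is
        math.ceil(F / 3) + 2 * J,              # psi_2: coded FS
        gamma3,                                # psi_3: binary representation
    ]
    best = min(range(4), key=lambda k: gammas[k])
    return best, gammas
```

With a block size of 16 and `gamma3 = 64` (four bits per sample, an arbitrary choice for illustration), a block of zeros selects $\psi_0$, a block of ones selects $\psi_1$, a block of fours selects $\psi_2$, and a block of tens falls back to $\psi_3$.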
2.3 Simulation Result 

The Basic Compressor algorithm was simulated on a VAX 11/785. The data coded by the Basic
Compressor was generated at the Goddard Space Flight Center by Ms. M. Mingarelli-Armbruster.
Table 2.1 shows the coding rate for twenty intervals of thirty seconds each.

The average bit rate required to transmit all twenty intervals is 718.6 bits per second. This is
considerably higher than the original goal of 600 bits per second. However, if we compare these
results to the results in [1], we find that the two are in reasonably close agreement. To do the
comparison we have to obtain the entropy of the differenced original data and the number of bits/sample
used by the Rice algorithm. In the case of this simulation the entropy is .97 bits while the number
of bits/sample is 1.3. Thus there is a difference of .33 bits. This is only slightly higher than the
difference of .28 bits obtained by Rice in [1].

Table 2.1: Rates for the Rice Algorithm (Target: 18000 bits)

Interval #    Total # of bits used    Required rate (bits/s)
     1               21647                   721.6
     2               21385                   712.8
     3               21530                   717.7
     4               21562                   718.7
     5               21666                   722.2
     6               21424                   714.1
     7               21841                   728.0
     8               21630                   721.0
     9               21719                   723.9
    10               21568                   718.9
    11               21308                   710.3
    12               21509                   716.9
    13               21633                   721.1
    14               21822                   727.4
    15               21296                   709.8
    16               21701                   723.4
    17               21058                   701.9
    18               21312                   710.4
    19               21713                   723.8
    20               21888                   729.6
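
The summary figures quoted in the text can be recomputed from the bit counts of Table 2.1. The short sketch below does this; the variable and function names are chosen for the example, and the 30 second interval and 16,384 bins are taken from the problem statement.

```python
INTERVAL_SECONDS = 30
BINS = 2 ** 14          # 14 bit A/D, one bin per output level

# Total number of bits used per interval, copied from Table 2.1.
RICE_BITS = [21647, 21385, 21530, 21562, 21666, 21424, 21841, 21630, 21719, 21568,
             21308, 21509, 21633, 21822, 21296, 21701, 21058, 21312, 21713, 21888]

def summarize(bits, seconds=INTERVAL_SECONDS, bins=BINS):
    """Return (average rate in bits per second, average bits per bin)."""
    mean_bits = sum(bits) / len(bits)
    return mean_bits / seconds, mean_bits / bins

rate, bits_per_sample = summarize(RICE_BITS)
print(f"{rate:.1f} bits/s, {bits_per_sample:.2f} bits/sample")
```

The result is close to 719 bits per second and about 1.3 bits per sample, in agreement with the figures used in the comparison with [1].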


More disturbing than the higher than expected rate, however, is the behavior of the
algorithm in the presence of channel noise. The results in Table 2.2 are from a hundred different runs. It was
assumed that the receiver was resynchronized every 30 seconds.

Table 2.2: Effect of Noise on the Rice Algorithm

Probability of Error    Mean Squared Error    Mean Absolute Error    Number of Errors
      10^-3                   437.8                  15.8                 15,689
      10^-4                    39.9                   3.3                 10,228
      10^-5                     4.1                    .4                  1,773


The number of errors column shows how many of the 16,384 values were received incorrectly.
As can be seen from the table, at a probability of error of $10^{-3}$ effectively all the received values
are in error; at a probability of error of $10^{-4}$ about two thirds of the received symbols are in error;
and even at a probability of error of $10^{-5}$ more than 10% of the received symbols are in error.

There are two main reasons for this lack of robustness of the system. Firstly, by its very nature,
an adaptive system is vulnerable to noise, as an error at the receiver may cause it to mistake the
coding scheme used at the transmitter. Secondly, the differencing operation necessary to create the
uncorrelated sequence required by the Rice algorithm also creates infinite memory at the receiver.
This means that once an error has occurred, it will propagate until resynchronization occurs. In
these situations we have assumed that resynchronization occurs at the end of each thirty second
interval. If this is not true the effect of errors may be even more disastrous.

Because of our concerns with the Rice algorithm under the current conditions we investigated
the possibility of developing an alternative algorithm.

2.4 Possible Alternate Algorithm 

The first step in our alternative algorithm is also a differencing operation, only it uses a leaky differencer. The
difference signal is obtained as

\[ d(n) = x(n) - \lfloor a \, x(n-1) \rfloor \]

where $a$ denotes the leakage coefficient. The leakage in the differencer causes error effects to die out in time. The difference signal
is encoded using a sixteen symbol modified runlength code. These sixteen symbols can then be
encoded using either a four bit fixed length code or a variable rate Huffman code. The results of
using this algorithm to encode the same twenty intervals are shown in Table 2.3 where VR stands
for variable rate code and FR stands for fixed rate code.
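
The way leakage limits error propagation can be seen in a small sketch. It assumes the leaky difference has the form $d(n) = x(n) - \lfloor a\,x(n-1) \rfloor$ with a leakage coefficient $a$; the value $a = 0.5$ used in the usage note, the zero initial state, and the function names are all assumptions made for illustration, and the runlength coding stage is omitted.

```python
import math

def leaky_encode(xs, a):
    """Leaky differencer: subtract the scaled, floored previous sample."""
    prev, out = 0, []
    for x in xs:
        out.append(x - math.floor(a * prev))
        prev = x
    return out

def leaky_decode(ds, a):
    """Inverse operation at the receiver, driven by its own reconstruction."""
    prev, out = 0, []
    for d in ds:
        prev = d + math.floor(a * prev)
        out.append(prev)
    return out
```

If one difference value is corrupted by 50 counts, a decoder with $a = 0.5$ roughly halves the reconstruction error at every step until at most one count of error remains, whereas with $a = 1$ (ordinary differencing) the full error persists until resynchronization.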

If we use the fixed rate code, the bit rate required is 594 bits per second which is below our 
target rate. The use of the variable rate code would require a channel rate of 522 bits per second. 
As both the fixed and variable rate codes will meet the target rate and as the fixed rate code is 
both robust and simple to use, we elected to go with the fixed rate code. 

Table 2.3: Coding Rates for Alternative Algorithm (Target: 18000 bits)

Interval Number    # of bits used (FR)    # of bits used (VR)
       1                 17382                  15733
       2                 17528                  15345
       3                 17784                  15520
       4                 17840                  15691
       5                 18144                  15883
       6                 17504                  15457
       7                 18048                  15882
       8                 18096                  15907
       9                 18132                  15843
      10                 18096                  15695
      11                 17604                  15438
      12                 17728                  15580
      13                 17780                  15581
      14                 18016                  15913
      15                 17564                  15361
      16                 17956                  15872
      17                 17296                  15139
      18                 17688                  15449
      19                 18160                  16033
      20                 18292                  16125


The effect of noise on this system is tabulated in Table 2.4. 


Table 2.4: Effect of Noise on the Alternative Algorithm

Probability of Error    Mean Squared Error    Mean Absolute Error    Number of Errors
      10^-3                   1.336                  0.329                 3161
      10^-4                   0.426                  0.111                 1384
      10^-5                   0.109                  0.035                  459


Comparing Table 2.4 to Table 2.2, we see that the number of errors is reduced by a factor of
roughly four to seven and the mean squared error by one to two orders of magnitude. Especially striking
is the mean squared error at a probability of error of $10^{-3}$. At this probability of error use of the
Rice algorithm results in an error of 437.8 while the use of the alternative algorithm results in an
error of 1.3!

While this algorithm is still in its preliminary stages and requires considerable testing, we feel 
that these results make it attractive enough to pursue in further detail. 

2.5 Continuing Effort


We have spent the first six months of this project evaluating the Rice algorithm and developing a 
possible alternative algorithm. In the next six months we will develop error correction algorithms 
for both these algorithms, at which time we will be better able to evaluate both algorithms. We 
also plan to look at ways of combining the two algorithms for improved performance. 

Section 3


High Fidelity Low Rate Coding of Images 


Let $F$ be an $N \times N$ image segment. $F$ can be transform coded into a 2-dimensional representation

\[ C = T(F). \]

If the transform is linear, $C$ can be represented as a double sum

\[ C_{ij} = \sum_{k,l=1}^{N} T_{ijkl} F_{kl} \]

Here the $i$ and $k$ subscripts represent x-transform coefficients in the image and transform spaces.
Likewise, $j$ and $l$ represent y-transform coefficients. This 2-dimensional transform can be converted
into a scalar transform by stacking the image and transform matrices,

\[ c_p = \sum_{m=1}^{N^2} t_{pm} f_m \tag{3.1} \]

where $F$ has been stacked into a vector $f$ of size $N^2 \times 1$, $T$ has been stacked into $t$, a matrix of size
$N^2 \times N^2$, and $c$ is an $N^2 \times 1$ vector of the transform coefficients. Eq. (3.1) is attained using the
following stacking operation on the $i, j, k, l$ indices:

\[ m = N(l - 1) + k \]
\[ p = N(i - 1) + j. \]

The transform vector $c$ must be quantized if data compression is to be attained,

\[ \hat{c}_p = Q_p(\alpha_p c_p) \]

where the $\alpha_p$ are scaling factors to match the variance of the $c_p$ to that of the quantizer codebook.
In general, a different codebook can be used for each of the $N^2$ coefficients of the transform.

The coding method used here keeps only the 4 lowest frequency coefficients of the transform
(e.g., DCT, Hadamard) so there are only 4 $c_p$ elements. Thus, $t$ is a $4 \times N^2$ matrix. To map the
non-zeroed elements of $c$ into $\hat{c}$ the following quantizer mappings are needed:

\[ \hat{c}_1 = Q_1(\alpha_1 c_1) \quad \text{(dc coefficient)} \]
\[ \hat{c}_2 = Q_2(\alpha_2 c_2) \]
\[ \hat{c}_3 = Q_2(\alpha_3 c_3) \]
\[ \hat{c}_4 = Q_2(\alpha_4 c_4) \]



Figure 3.1: Only 4 transform coefficients are used for each image segment. 

The non-dc elements of $c$ are distributed in a similar fashion and can be quantized with the same
quantizer. The dc coefficient is distributed differently than the other coefficients and requires a
different quantizer to attain the best results. For the work done here $Q_1$ is coded to 8 bits and $Q_2$
is coded to 5 bits, so that each coded block requires 8 + 3(5) = 23 bits.

The image can be recovered from the 4 transform coefficients by solving (3.1),

\[ \hat{f} = (t^T t)^{-1} t^T \hat{c} \]

If the rows of $t$ are orthonormal this reduces to $\hat{f} = t^T \hat{c}$.
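
The stacked four coefficient transform can be sketched in plain Python. The example uses the orthonormal DCT and takes the four lowest frequency coefficients to be those with horizontal and vertical frequency indices 0 or 1; that selection, the function names, and the omission of the quantizers $Q_1$ and $Q_2$ are assumptions for illustration. The stacking follows $m = N(l-1) + k$ written with zero-based indices.

```python
import math

def dct_basis(n, u):
    """u-th orthonormal DCT-II basis vector of length n."""
    scale = math.sqrt((1 if u == 0 else 2) / n)
    return [scale * math.cos((2 * k + 1) * u * math.pi / (2 * n)) for k in range(n)]

def stack(image):
    """Stack an n x n image into a vector with index m = n*l + k."""
    n = len(image)
    return [image[k][l] for l in range(n) for k in range(n)]

def low_frequency_rows(n):
    """The 4 x n^2 matrix t: stacked basis images for frequencies (0..1, 0..1)."""
    rows = []
    for u in (0, 1):
        for v in (0, 1):
            bu, bv = dct_basis(n, u), dct_basis(n, v)
            rows.append([bu[k] * bv[l] for l in range(n) for k in range(n)])
    return rows

def analyze(t, f):
    """c = t f, as in Eq. (3.1) restricted to the four retained rows."""
    return [sum(tp * fm for tp, fm in zip(row, f)) for row in t]

def synthesize(t, c):
    """f_hat = t^T c, valid because the rows of t are orthonormal."""
    return [sum(c[p] * t[p][m] for p in range(len(t))) for m in range(len(t[0]))]
```

For a 2 x 2 block the four retained basis images span the whole space, so `synthesize(t, analyze(t, f))` returns the block unchanged; for larger blocks only the smooth part of the block is reproduced.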

Let us define two distortion measures to rate the performance of the above 4 coefficient transform
method:

1) $d_1 = \max_i |f_i - \hat{f}_i|$

2) $d_2 = \sum_{i=1}^{N^2} (f_i - \hat{f}_i)^2 = E\{(f - \hat{f})^T (f - \hat{f})\}$

Method 1) indicates the largest absolute error between the original and reconstructed images. 






Method 2) is the common MSE measure where the signal-to-noise ratio is defined to be

\[ SNR(dB) = 20 \log_{10} \frac{\|f\|}{\|f - \hat{f}\|} \]

and the peak signal-to-noise ratio is

\[ PSNR(dB) = 20 \log_{10} \frac{\|w\|}{\|f - \hat{f}\|} \]

where $w$ is an $N^2 \times 1$ vector whose elements are set to the white level of the image, i.e., 255 for an
8-bit image.

Notice that for method 1) $|f_i - \hat{f}_i| \le d_1$ for all pixels in $f$, but it is possible that $d_2 \ll d_1$
if the transform coefficients represent the original image well except for a few pixels where the
distortion is large. This is true since $d_2$ is a measure of average distortion over the entire
image segment. So a good $d_2$ value will cause one to think that the image is coded with a good
match but, in fact, there may be areas of very large local distortion in the image.

To overcome the shortcoming of the $d_2$ distortion measure, an image segment is coded and the
$d_1$ distortion is measured. Then, depending on whether or not $d_1$ is less than some distortion
threshold, $t$, the segment may be subdivided into four $\frac{N}{2} \times \frac{N}{2}$ segments, $F_i$, $i = 1, 2, 3, 4$, and each
segment is again transform coded and checked against the distortion threshold. If a segment fails
the threshold test it is again subdivided until the threshold level is met or a minimum block size
for an image segment is attained.

Notice that if the minimum block size is 2 x 2, the 4 coefficients of the above method will
code the image segment exactly, to within the quantization resolution of the coder.

Method 

The method for image coding by threshold detection as described above is outlined below:

1. Select an image segment $F$.

2. Code $F$ by $\hat{F}$.

3. If $d_1(F - \hat{F}) < t$, $\hat{F}$ is an adequate representation of $F$.

4. If $d_1(F - \hat{F}) \ge t$, divide $F$ into four $\frac{N}{2} \times \frac{N}{2}$ sub-segments $F_i$, $i = 1, 2, 3, 4$, and go to 2) for each
segment until $d_1 < t$ or the minimum block size is attained. If the minimum block size is
attained code $F_i$ by $\hat{F}_i$.
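
The subdivision procedure outlined above can be sketched as a short recursive routine. To keep the example self-contained, the block coder is a stand-in that replaces a block by its mean value; it is not the 4 coefficient transform coder of this section, and any function mapping a block to its reconstruction can be passed in its place. All names are chosen for the example.

```python
def block_mean_coder(block):
    """Stand-in coder (an assumption): approximate the block by its mean."""
    n = len(block)
    mean = sum(map(sum, block)) / (n * n)
    return [[mean] * n for _ in range(n)]

def d1(block, approx):
    """Largest absolute pixel error between a block and its reconstruction."""
    return max(abs(a - b) for ra, rb in zip(block, approx) for a, b in zip(ra, rb))

def split(block):
    """Divide a block into four half-size sub-blocks, scanning row by row."""
    h = len(block) // 2
    return [[row[c:c + h] for row in block[r:r + h]] for r in (0, h) for c in (0, h)]

def code_segment(block, threshold, min_size, coder=block_mean_coder, row=0, col=0):
    """Return the (row, col, size) of every block that is finally coded."""
    size = len(block)
    if d1(block, coder(block)) < threshold or size <= min_size:
        return [(row, col, size)]
    half = size // 2
    leaves = []
    for k, sub in enumerate(split(block)):
        leaves += code_segment(sub, threshold, min_size, coder,
                               row + (k // 2) * half, col + (k % 2) * half)
    return leaves
```

On an 8 x 8 block that is flat except for one bright pixel, the routine keeps three 4 x 4 blocks and splits only the quadrant containing the bright pixel, so that small blocks are spent only where the local distortion is large.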

Example

Consider the coding of a 32 x 32 image segment. Let the maximum coding block size be
16 x 16 and the smallest block size be 4 x 4. Let the original image segment be coded with 8 bits
per pixel.

If the 32 x 32 segment is coded with four 16 x 16 blocks the coding rate for each such block is

\[ \frac{23}{16^2} = .090 \text{ bits/pixel} \]

Let the threshold level for the $d_1$ distortion measure be $t = 5$. This means that if the coded
16 x 16 blocks have a $d_1$ level $< t$ the maximum distortion at any given pixel is 2 bits (4 gray
scale levels). For the example at hand let the $d_1$ levels for the four 16 x 16 blocks be as shown in
Figure 3.2.


Figure 3.2: Image segment $F$ coded in 16 x 16 blocks.

In this case, $F_2$, $F_3$, and $F_4$ all meet the $d_1$ distortion threshold so they are adequately coded,
but $F_1$ does not meet the distortion threshold, so it must be divided into four 8 x 8 segments and
coded again. Figure 3.3 represents the $d_1$ distortion profile of subsegment $F_1$.

Figure 3.3: Subsegment $F_1$ coded in 8 x 8 blocks.

In this case, segments 1 and 3 meet the distortion threshold and are adequately coded, but
segments 2 and 4 must be divided into 4 x 4 blocks and recoded. Before doing this, let us calculate
the coding rate for the original image up to this point in the process. Each of the three 16 x 16
blocks that meet the threshold requires 23 bits, and the remaining 16 x 16
block, which has been divided into four 8 x 8 blocks, requires 23 bits for each of those blocks. So the number
of coding bits for the entire 32 x 32 image segment is

\[ 3(23) + 4(23) = 161 \text{ bits} \]

and if the smallest block size were 8 x 8 then the coding rate for this total 32 x 32 image would be

\[ 161/32^2 = .157 \text{ bits/pixel.} \]

Since the minimum block size is 4 x 4 and two of the 8 x 8 blocks do not meet the $d_1$ distortion
threshold, the segment $F$ must be divided for coding as shown in Figure 3.4.

Figure 3.4: The final segmenting of segment $F_1$

Now all of the blocks of the original image meet the $d_1$ distortion threshold except for two 4 x 4
blocks whose $d_1$ levels are 8 and 6. (If the 4 x 4 blocks could be divided into 2 x 2 blocks then
this situation could be improved, but, for this example, the 4 x 4 blocks are the smallest segments
to be coded.) The final number of bits to code the original 32 x 32 image segment is

3(23) for 3 16 x 16 blocks;
2(23) for 2 8 x 8 blocks;
8(23) for 8 4 x 4 blocks;

for a total of 299 bits. So the coding rate is $299/32^2 = .292$ bits/pixel for a data compression ratio
of

\[ 8/.292 = 27.4. \]

When the receiver gets these 299 bits of coefficient data it must know how to apply the coefficients
to reconstruct the subsegments of the 32 x 32 image segment correctly. To do this, a small amount
of side information must be transmitted.

Let four bits of side information be transmitted for each image segment that can be subdivided.
Each bit is set to 1 if the corresponding subsegment is to be divided and to 0 if it is
not to be divided. Since only segment 1 of the 32 x 32 image is to be divided, the four bits of side
information are coded 1000. Since this 16 x 16 segment has two 8 x 8 subsegments to be divided
again, namely blocks 2 and 4, the four bits of side information for this subsegment are 0101. Thus,
the full string of side information is 1000,0101, for a total coding bit rate of

\[ \frac{299 + 8}{32^2} = .300 \text{ bits/pixel} \]

and a final data compression ratio of 26.7.
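
The bit count of this example can be reproduced from the side information alone. The sketch below is specific to the two level case of the example (16 x 16 blocks that may be split into 8 x 8 blocks, which may in turn be split into 4 x 4 blocks that need no further side information); the function names are chosen for illustration, and the 23 bits per block are the 8 bit dc coefficient plus three 5 bit coefficients.

```python
BITS_PER_BLOCK = 8 + 3 * 5      # one dc coefficient and three other coefficients

def count_coded_blocks(side_words):
    """side_words[0] describes the four 16 x 16 blocks; each following word
    describes the four 8 x 8 blocks of one divided 16 x 16 block."""
    blocks = side_words[0].count("0")               # undivided 16 x 16 blocks
    for word in side_words[1:]:
        blocks += word.count("0")                   # undivided 8 x 8 blocks
        blocks += 4 * word.count("1")               # each split into four 4 x 4
    return blocks

def coding_summary(side_words, segment_size=32, source_bits_per_pixel=8):
    """Return (coefficient bits, total bits, bits per pixel, compression ratio)."""
    coefficient_bits = count_coded_blocks(side_words) * BITS_PER_BLOCK
    total_bits = coefficient_bits + 4 * len(side_words)
    rate = total_bits / segment_size ** 2
    return coefficient_bits, total_bits, rate, source_bits_per_pixel / rate
```

Calling `coding_summary(["1000", "0101"])` gives 299 coefficient bits and 307 bits in total, which corresponds to the .300 bits/pixel and the compression ratio of 26.7 quoted above.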

When the receiver has the side information string it will know that the three 0’s of the first
block say that the corresponding 16 x 16 segments are to be coded as single blocks. The first bit
is a 1 so the receiver will need to look at the second side block to see how to subdivide its 16 x 16
block. In the second side block the two 0’s say that blocks 1 and 3 will be coded as 8 x 8 segments and
blocks 2 and 4 must be divided into 4 x 4 blocks. So the 8 bits of side information tell the receiver
how it must subdivide the original 32 x 32 image segment as shown in Figure 3.5.



Figure 3.5: The subdivided 32 x 32 image segment with $d_1$ distortions and required side information.